The Autumn reprocessing is imminent!

19 October 2010



While it feels like the rush for the summer conferences has only just ended, the people involved in the reprocessing campaign are already working hard to meet the deadlines for the 2011 winter conferences.

Reprocessing the ATLAS data is an exercise in resolving the tension between keeping the reconstruction software stable for physics analysis and taking advantage of constant improvements in the software algorithms, in the calibration and alignment (stored in so-called conditions data), and in the detector simulation. A full reprocessing campaign involves reconstructing all of the raw ATLAS data and the Monte Carlo samples again with an updated software release. The benefits for the physics program are obvious: an improved and completely uniform dataset that can be used for analysis.

The current autumn reprocessing campaign is breaking new ground. For the first time, the number of data events to be processed is larger than the number of Monte Carlo events: adding up the events in all streams, there are close to one billion data events to be reprocessed in this campaign. It is also the first time we have the chance to incorporate the improved understanding of the detector gained from the high-energy data collected during the six months since the first 7 TeV collisions in April. The analysis of that data has, for example, led to a much improved knowledge of the alignment of the Inner Detector, the Calorimeters and the Muon Spectrometer.

From start to finish, a reprocessing campaign goes through several phases involving experts from a wide range of areas. It usually starts with the building of a new release; for the autumn reprocessing we are using release 16, which had a deadline for software improvements at the end of August. There then follow a few weeks of building the release and carrying out the initial technical validation, which means making sure that the release compiles and that it runs without crashing. Once this has been achieved, the physics validation team gets involved and starts the daunting task of identifying any problems with the new reconstruction. They act as a safety net for the software developers and the conditions experts, catching unwanted degradations in the quality of the data. Once they are satisfied that the new reconstruction looks good, a final sign-off from the ATLAS Data Quality, physics and combined performance groups is obtained before the reprocessing is launched.

This is the stage the autumn reprocessing campaign has now reached. As a final full dress rehearsal of the release, the express stream for all runs with stable beams has been reprocessed on the GRID, and the Data Quality, physics and combined performance groups are currently looking at the data before providing the final sign-off. Once the go-ahead is given, the reprocessing campaign will enter the production phase and the jobs will be launched on the GRID. With close to one billion events to be reconstructed and an average reconstruction time of around 15 seconds per event, it would take a single computer almost 500 years to complete all the jobs. Fortunately, our computing model involves parallel processing: by using a total of around 10,000 CPUs distributed across the various Tier-1 sites all over the world, we can expect to complete the production within three to four weeks of launching the jobs. That brings us to the end of November, when we will hand over the improved dataset to the various physics groups so they can start their own mad rush towards the deadline for the winter conferences.
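For readers who like to check the arithmetic, here is a minimal back-of-the-envelope sketch of the numbers above. The event count, per-event time and CPU count are the round figures quoted in the text; the CPU efficiency factor is purely an illustrative assumption, not an ATLAS figure.

```python
# Back-of-the-envelope check of the reprocessing time estimates.
# Inputs are the round numbers quoted in the text; the efficiency
# factor is an illustrative assumption.

SECONDS_PER_YEAR = 365.25 * 24 * 3600
SECONDS_PER_WEEK = 7 * 24 * 3600

n_events = 1e9         # close to one billion data events
t_per_event = 15.0     # average reconstruction time in seconds
n_cpus = 10000         # CPUs across the Tier-1 sites
efficiency = 0.7       # assumed fraction of wall time doing useful work

total_cpu_seconds = n_events * t_per_event

single_cpu_years = total_cpu_seconds / SECONDS_PER_YEAR
parallel_weeks = total_cpu_seconds / (n_cpus * efficiency) / SECONDS_PER_WEEK

print("Single CPU: %.0f years" % single_cpu_years)        # ~476 years
print("10,000 CPUs: %.1f weeks" % parallel_weeks)          # ~3.5 weeks
```

With these inputs, a single CPU would indeed need close to 500 years, while 10,000 CPUs running at the assumed 70% efficiency finish in roughly three and a half weeks, consistent with the three-to-four-week estimate above.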

 

Jonas Strandberg
University of Michigan